CEE 218X Assignment 2
The following is a map view of Bay Area counties.
For the purpose of this assignment, I am looking specifically at Santa Clara County. The following is a map view of Santa Clara.
The following is the population map of Santa Clara in 2010.
The following subsets the 2010 data to the priority development areas (PDAs).
## Reading layer `priority_development_areas_pba2050' from data source
## `https://opendata.arcgis.com/datasets/4df9cb38d77346a289252ced4ffa0ca0_0.geojson'
## using driver `GeoJSON'
## Simple feature collection with 218 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -123.0216 ymin: 36.98737 xmax: -121.5564 ymax: 38.81092
## Geodetic CRS: WGS 84
To estimate the 2010 population within the PDAs, use bracket-based spatial filtering: take all the red-outlined blocks in the map above and sum their population.
## [1] 162274
Alternatively, estimate the 2010 population within the PDAs by subsetting to only those blocks whose centroid lies inside the PDA boundaries.
## [1] 97075
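The gap between the two estimates above comes from how blocks on the PDA edge are treated. The following is a minimal sketch of both approaches on toy data, assuming the `sf` package is used. The block IDs, populations, square geometries, and the `make_square` helper are all hypothetical and are not the actual census blocks or PDA boundaries.

```r
library(sf)

# Hypothetical helper: a unit square polygon from its lower-left corner.
make_square <- function(x, y, size = 1) {
  st_polygon(list(rbind(
    c(x, y), c(x + size, y), c(x + size, y + size), c(x, y + size), c(x, y)
  )))
}

# Three toy "blocks" side by side with made-up populations.
# No CRS is set, so the geometry operations are planar.
blocks <- st_sf(
  block_id = c("A", "B", "C"),
  pop = c(100, 200, 300),
  geometry = st_sfc(make_square(0, 0), make_square(1, 0), make_square(2, 0))
)

# A toy "PDA" that fully crosses block B and only clips blocks A and C.
pda <- st_sf(
  name = "toy PDA",
  geometry = st_sfc(st_polygon(list(rbind(
    c(0.9, 0.2), c(2.2, 0.2), c(2.2, 0.8), c(0.9, 0.8), c(0.9, 0.2)
  ))))
)

# Method 1: bracket-based spatial filtering keeps every block that
# intersects the PDA, even if only a sliver overlaps.
blocks_bracket <- blocks[pda, ]
pop_bracket <- sum(blocks_bracket$pop)

# Method 2: keep only blocks whose centroid falls inside the PDA.
centroids <- st_centroid(st_geometry(blocks))
centroid_inside <- lengths(st_intersects(centroids, pda)) > 0
pop_centroid <- sum(blocks$pop[centroid_inside])

pop_bracket
pop_centroid
```

In this toy case the bracket method counts all three blocks, while the centroid method counts only block B, which is why the bracket estimate is always at least as large as the centroid estimate.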
The following is the population map of Santa Clara in 2020.
The following subsets the 2020 data to the priority development areas (PDAs).
## Reading layer `priority_development_areas_pba2050' from data source
## `https://opendata.arcgis.com/datasets/4df9cb38d77346a289252ced4ffa0ca0_0.geojson'
## using driver `GeoJSON'
## Simple feature collection with 218 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -123.0216 ymin: 36.98737 xmax: -121.5564 ymax: 38.81092
## Geodetic CRS: WGS 84
To estimate the 2020 population within the PDAs, use bracket-based spatial filtering: take all the red-outlined blocks in the map above and sum their population.
## [1] 540362
Alternatively, estimate the 2020 population within the PDAs by subsetting to only those blocks whose centroid lies inside the PDA boundaries.
## [1] 332875
The following is the absolute change in population of Santa Clara between 2010 and 2020.
Findings:
Based on the absolute difference in population between 2010 and 2020 in Santa Clara, there seem to be many small pockets of large population change, while most of the area experienced very little change. However, the population difference alone does not give a full picture of what this means: we don’t know the breakdown of migration, births, and deaths, so we can’t draw a clear conclusion about how much the actual communities changed. Another key finding is my learning curve with R. I had never used R before, and I found a significant gap between the chapters/in-class R sessions and the assignment. Because of that, I spent a lot of time learning how to combine and manipulate data tables. This was an insightful R assignment, but further in-class/online chapter support would be appreciated for future iterations of this course and future assignments.
Key Assumptions:
In the census data we used, 2010 had more data points than 2020. There may be a variety of reasons for this, ranging from COVID-related issues to changes in how the data was collected to changes in neighborhood boundaries. To present a visual of the difference between 2010 and 2020, I replaced missing data and “NA” values with zero. This does not entirely reflect the reality of population change, but it gives a sense of what the data shows and of where the data collected diverges from the actual situation. Another assumption is the validity, completeness, and accuracy of the data set used. This data set is gathered and produced by the US Census, so we are assuming it comes from a credible source on the topic and was gathered in a fair and unbiased way.
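The zero-fill step described above can be sketched as follows. This is only an illustration in base R under assumed names: the `block` column, the population values, and the table names are hypothetical, and the actual assignment may have used different functions to join the tables.

```r
# Toy 2010 and 2020 tables; block "002" is missing from 2020
# and block "004" is missing from 2010.
pop_2010 <- data.frame(block = c("001", "002", "003"),
                       pop_2010 = c(120, 80, 45))
pop_2020 <- data.frame(block = c("001", "003", "004"),
                       pop_2020 = c(150, 40, 60))

# Full outer join so that blocks present in only one year are kept.
change <- merge(pop_2010, pop_2020, by = "block", all = TRUE)

# Replace the NAs created by the join with zero (the assumption stated above).
change$pop_2010[is.na(change$pop_2010)] <- 0
change$pop_2020[is.na(change$pop_2020)] <- 0

# Change in population count from 2010 to 2020.
change$abs_change <- change$pop_2020 - change$pop_2010

change
```

Note that block "002" appears as a loss of 80 people only because it has no 2020 row. That is the kind of artifact the zero-fill assumption can introduce, so large changes on the map may reflect missing data rather than real population shifts.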